Fast Arabic Glyph Recognizer based on Haar Cascade Classifiers

Authors

  • Ashraf AbdelRaouf
  • Colin Higgins
  • Tony P. Pridmore
  • Mahmoud I. Khalil
Abstract

Optical Character Recognition (OCR) is an important technology. The Arabic language lacks both the variety of OCR systems and the depth of research relative to Roman scripts. A machine learning approach, the Haar Cascade Classifier (HCC), was introduced by Viola and Jones (Viola and Jones 2001) to achieve rapid object detection based on a boosted cascade of Haar-like features. Here, that approach is modified for the first time to suit Arabic glyph recognition. The HCC approach eliminates problematic steps in the preprocessing and recognition phases and, most importantly, the character segmentation stage. A recognizer was produced for each of the 61 Arabic glyphs that exist after the removal of diacritical marks. These recognizers were trained and tested on some 2,000 images each. The system was tested with real text images and produced a recognition rate for Arabic glyphs of 87%. The proposed method is fast, with an average document recognition time of 14.7 seconds compared with 15.8 seconds for commercial software.
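As background on the Haar-like features the abstract refers to, the following is a minimal sketch of how a single two-rectangle feature can be evaluated in constant time from an integral image (summed-area table), the general technique described by Viola and Jones. It is not the authors' recognizer: the function names, the toy 4x4 image, and the choice of a horizontal two-rectangle feature are assumptions made purely for illustration.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right


# Hypothetical toy image: bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]

ii = integral_image(image)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)
```

Each rectangle sum costs four table lookups regardless of its size, which is one reason cascades built on such features can be evaluated quickly. A boosted cascade combines many such features with learned thresholds; that training step is not shown here.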

Similar resources

A Multi-Stage Approach to Fast Face Detection

A multi-stage approach — which is fast, robust and easy to train — for a face-detection system is proposed. Motivated by the work of Viola and Jones [1], this approach uses a cascade of classifiers to yield a coarse-to-fine strategy to reduce significantly detection time while maintaining a high detection rate. However, it is distinguished from previous work by two features. First, a new stage ...

Squiggle - A Glyph Recognizer for Gesture Input

Squiggle is a template-based glyph recognizer in the lineage of “$1 Recognizer”[1] and “Protractor”[2]. It seeks a good fit linear affine mapping between the input and template glyphs which are represented as a list of milestone points along the glyph path. The algorithm can recognize input glyphs invariant of rotation, scaling, skew, and reflection symmetries. In practice the algorithm is fast...

MULTIBOOST: A Multi-purpose Boosting Package

The MULTIBOOST package provides a fast C++ implementation of multi-class/multi-label/multitask boosting algorithms. It is based on ADABOOST.MH but it also implements popular cascade classifiers and FILTERBOOST. The package contains common multi-class base learners (stumps, trees, products, Haar filters). Further base learners and strong learners following the boosting paradigm can be easily imp...

The Architecture of the Face and Eyes Detection System Based on Cascade Classifiers

Precise face and eye detection is crucial in many Human-Machine Interface systems. An important issue is a reliable object detection method. In this paper we present the architecture of a 3-stage face and eye detection system based on Haar Cascade Classifiers. By applying the proposed system to a set of 10000 test images, 94% of the eyes were properly detected and precisely loca...

Real-Timely Detecting License Plate under Various Conditions

This paper proposes a learning-based algorithm for real-time license plate detection. Two kinds of features, statistical gradient features and Haar-like features, are used in the algorithm. Firstly, two statistical features are extracted from vertical gradient images. Classifiers based on these two features are constructed through simple learning procedures respectively. Using these classifiers...

Publication date: 2014